ED410316 1996-12-00 Norm- and Criterion-Referenced Testing. ERIC/AE Digest.

ERIC Development Team 

www.eric.ed.gov

Table of Contents 

Norm- and Criterion-Referenced Testing. ERIC/AE Digest

INTENDED PURPOSES

SELECTION OF TEST CONTENT

TEST INTERPRETATION

SUMMARY

REFERENCES

ERIC Digests

ERIC Identifier: ED410316 
Publication Date: 1996-12-00 
Author: Bond, Linda A. 

Source: ERIC Clearinghouse on Assessment and Evaluation Washington DC. 

Norm- and Criterion-Referenced Testing. 
ERIC/AE Digest. 

THIS DIGEST WAS CREATED BY ERIC, THE EDUCATIONAL RESOURCES 
INFORMATION CENTER. FOR MORE INFORMATION ABOUT ERIC, CONTACT 
ACCESS ERIC 1-800-LET-ERIC

Tests can be categorized into two major groups: norm-referenced tests and 
criterion-referenced tests. These two types of tests differ in their intended purposes, the way in
which content is selected, and the scoring process, which defines how the test results
must be interpreted. This brief paper will describe the differences between these two 
types of assessments and explain the most appropriate uses of each. 

INTENDED PURPOSES 



The major reason for using a norm-referenced test (NRT) is to classify students. NRTs
are designed to highlight achievement differences between and among students to 
produce a dependable rank order of students across a continuum of achievement from 
high achievers to low achievers (Stiggins, 1994). School systems might want to classify 
students in this way so that they can be properly placed in remedial or gifted programs. 
These types of tests are also used to help teachers select students for different
ability-level reading or mathematics instructional groups.

With norm-referenced tests, a representative group of students is given the test prior to 
its availability to the public. The scores of the students who take the test after 
publication are then compared to those of the norm group. Tests such as the California 
Achievement Test (CTB/McGraw-Hill), the Iowa Test of Basic Skills (Riverside), and the 
Metropolitan Achievement Test (Psychological Corporation) are normed using a 
national sample of students. Because norming a test is such an elaborate and 
expensive process, the norms are typically used by test publishers for seven years. All
students who take the test during that seven-year period have their scores compared to
the original norm group. 

While norm-referenced tests ascertain the rank of students, criterion-referenced tests
(CRTs) determine "...what test takers can do and what they know, not how they
compare to others" (Anastasi, 1988, p. 102). CRTs report how well students are doing
relative to a pre-determined performance level on a specified set of educational goals or 
outcomes included in the school, district, or state curriculum. 

Educators or policy makers may choose to use a CRT when they wish to see how well 
students have learned the knowledge and skills which they are expected to have 
mastered. These results may be used as one piece of information to determine how
well the student is learning the desired curriculum and how well the school is teaching 
that curriculum. 

Both NRTs and CRTs can be standardized. The U.S. Congress, Office of Technology 
Assessment (1992) defines a standardized test as one that uses uniform procedures for 
administration and scoring in order to assure that the results from different people are 
comparable. Any kind of test, from multiple choice to essays to oral examinations, can
be standardized if uniform scoring and administration are used (p. 165). This means 
that the comparison of student scores is possible. Thus, it can be assumed that two 
students who receive the identical scores on the same standardized test demonstrate 
corresponding levels of performance. Most national, state and district tests are 
standardized so that every score can be interpreted in a uniform manner for all students 
and schools.

SELECTION OF TEST CONTENT 



Test content is an important factor in choosing between an NRT and a CRT. The
content of an NRT is selected according to how well it ranks students from high
achievers to low. The content of a CRT is determined by how well it matches the
learning outcomes deemed most important. Although no test can measure everything of
importance, the content of the CRT is selected on the basis of its significance
in the curriculum while that of the NRT is chosen by how well it discriminates among
students.
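
As a rough illustration of what "discriminates among students" can mean in practice, the sketch below computes a classical upper-lower discrimination index for a single item: the proportion answering correctly in the highest-scoring group minus the proportion in the lowest-scoring group. The digest does not say which statistic test publishers use, so the index, the function name, the 27% group size, and the data are all assumptions made for illustration.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index for one item (illustrative sketch).

    item_correct: list of 1/0 flags, one per student, for this item.
    total_scores: list of total test scores, in the same student order.
    fraction: share of students in each extreme group (0.27 is a common
              convention, assumed here; the digest does not specify one).
    """
    n = len(total_scores)
    if n == 0 or n != len(item_correct):
        raise ValueError("inputs must be non-empty and of equal length")
    k = max(1, round(n * fraction))
    # Student indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: ten students with total scores 1 through 10.
totals = list(range(1, 11))
item_a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # only the stronger students succeed
item_b = [1] * 10                        # every student succeeds

print(discrimination_index(item_a, totals))  # 1.0
print(discrimination_index(item_b, totals))  # 0.0
```

Under this assumed index, an item that every student answers correctly scores zero no matter how important its content is. A CRT would retain such an item if it matched a valued learning outcome, whereas an NRT would have little use for it.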

Any national, state or district test communicates to the public the skills that students 
should have acquired as well as the levels of student performance that are considered 
satisfactory. Therefore, education officials at any level should carefully consider the content
of the test that is selected or developed. Because of the importance placed upon high
scores, the content of a standardized test can be very influential in the development of a 
school's curriculum and standards of excellence. 

NRTs have come under attack recently because they are said to have traditionally
focused on low-level, basic skills. This emphasis is in direct contrast to the
recommendations made by the latest research on teaching and learning which calls for 
educators to stress the acquisition of conceptual understanding as well as the 
application of skills. The National Council of Teachers of Mathematics (NCTM) has 
been particularly vocal about this concern. In an NCTM publication (1991), Romberg
(1989) reported that "a recent study of the six most commonly used commercial
achievement tests found that at grade 8, on average, only 1 percent of the items were 
problem solving while 77 percent were computation or estimation" (p. 8). 

In order to best prepare their students for the standardized achievement tests, teachers 
usually devote much time to teaching the information which is found on the 
standardized tests. This is particularly true if the standardized tests are also used to 
measure an educator's teaching ability. The pressure placed upon teachers
for their students to perform well on these tests has resulted in an emphasis on low-level
skills in the classroom (Corbett & Wilson, 1991). With curriculum specialists and
educational policy makers alike calling for more attention to higher level skills, these 
tests may be driving classroom practice in the opposite direction of educational reform. 

TEST INTERPRETATION 



As mentioned earlier, a student's performance on an NRT is interpreted in relation to the 
performance of a large group of similar students who took the test when it was first 
normed. For example, if a student receives a percentile rank score on the total test of 
34, this means that he or she performed as well as or better than 34% of the students in
the norm group. This type of information can be useful for deciding whether a student
needs remedial assistance or is a candidate for a gifted program. However, the score
gives little information about what the student actually knows or can do. The validity of 
the score in these decision processes depends on whether or not the content of the 
NRT matches the knowledge and skills expected of the students in that particular school 
system. 
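
The percentile rank described above can be sketched in a few lines. The sketch follows the digest's wording (the percentage of norm-group students whose score the student equals or exceeds). Publishers may use other conventions, such as counting half of the tied scores, so the exact formula, the function name, and the norm-group data are assumptions made for illustration.

```python
from bisect import bisect_right


def percentile_rank(score, norm_scores):
    """Percent of norm-group scores less than or equal to `score`."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    ordered = sorted(norm_scores)
    # bisect_right counts how many norm scores are <= score.
    at_or_below = bisect_right(ordered, score)
    return 100.0 * at_or_below / len(ordered)


# Hypothetical norm group: 50 students with raw scores 1 through 50.
norm_group = list(range(1, 51))

print(percentile_rank(17, norm_group))  # 34.0
```

The result says only where the raw score of 17 falls relative to the norm group. It carries no information about which items the student answered correctly.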

It is easier to ensure the match to expected skills with a CRT. CRTs give detailed 
information about how well a student has performed on each of the educational goals or 
outcomes included on that test. For instance, "... a CRT score might describe which 
arithmetic operations a student can perform or the level of reading difficulty he or she 
can comprehend" (U.S. Congress, OTA, 1992, p. 170). As long as the content of the
test matches the content that is considered important to learn, the CRT gives the 
student, the teacher, and the parent more information about how much of the valued 
content has been learned than an NRT. 
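
By way of contrast, a CRT report might be sketched as follows: each objective is scored separately and compared with a pre-determined performance level rather than with other students. The objective names, the item responses, and the 80% cut score are hypothetical and are not taken from any particular test.

```python
def crt_report(responses, cut_score=0.8):
    """Per-objective results for one student (illustrative sketch).

    responses: dict mapping an objective name to a list of 1/0 item scores.
    cut_score: proportion correct required to count an objective as mastered
               (an assumed value; real tests set their own standards).
    """
    report = {}
    for objective, items in responses.items():
        proportion = sum(items) / len(items)
        report[objective] = {
            "proportion_correct": proportion,
            "mastered": proportion >= cut_score,
        }
    return report


# Hypothetical student responses on three arithmetic objectives.
student = {
    "addition": [1, 1, 1, 1, 1],
    "subtraction": [1, 1, 1, 1, 0],
    "multiplication": [1, 0, 1, 0, 0],
}

report = crt_report(student)
for objective, result in report.items():
    print(objective, result["proportion_correct"], result["mastered"])
```

No norm group appears anywhere in this calculation. The output depends only on the student's own responses and the chosen performance level.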

SUMMARY 



Public demands for accountability, and consequently for high standardized test scores,
are not going to disappear. In 1994, thirty-one states administered NRTs, while 
thirty-three states administered CRTs. Among these states, twenty-two administered 
both. Only two states rely on NRTs exclusively, while one state relies exclusively on a 
CRT. Acknowledging the recommendations for educational reform and the popularity of 
standardized tests, some states are designing tests that "reflect, insofar as possible, 
what we believe to be appropriate educational practice" (NCTM, 1991, p.9). In addition 
to this, most states also administer other forms of assessment such as a writing sample, 
some form of open-ended performance assessment or a portfolio (CCSSO/NCREL, 
1994). 

Before a state can choose what type of standardized test to use, the state education 
officials will have to consider if that test meets three standards. These criteria are 
whether the assessment strategy(ies) of a particular test matches the state's 
educational goals, addresses the content the state wishes to assess, and allows the 
kinds of interpretations state education officials wish to make about student 
performance. Once they have determined these three things, the task of choosing 
between the NRT and CRT will become easier.